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(54) Database method and apparatus using hierarchical bit vector index structure 



(57) A database having fixed length records stored 
together in record number order and an index structure 
for the database. The index structure comprises a sep- 
arate index for each searchable field of the records. For 
purposes of indexing, the records are logically divided 
into fine slices of 8,000 records each, and the fine slices 
are grouped into coarse slices of 4,000 fine slices each. 
The indexes include fine and coarse keys, each of which 
corresponds to a particular data value and a particular 
fine or coarse slice. Associated with each key is a link 
that is used to determine which records contain the data 
value. For the fine keys, the link includes a pointer to a 
bit vector that has a single bit for each of the records 

FIG. 3 

IjoqImI Separation of Ran* 



within the fine slice associated with the key. For the 
coarse keys, the link includes a pointer to a bit vector 
that has a single bit for each of the fine slices contained 
in the coarse slice. The coarse bit vector comprises two 
bit vectors, one that identifies which of the fine slices 
within the coarse slice contains any records having the 
data value and one that identifies which fine slices, if 
any, contain the data value in every one of its records. 
The keys are stored in a B-tree in order of a unique key 
value so that they are processed in record number order 
and the resulting list of records for any query is gener- 
ated in record number order. 
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Description 

TECHNIP.Al fifi n 

[0001] This invention relates in general to techniques 
for storing and retrieving data on digital computers and 
in particular, to an improved method and apparatus for 
fast storage and retrieval of data from very large data- 
bases using multiple retrieval keys and complex 
retrieval criteria. 

BACKGROI IMn OF THF IMVFMT.r >M 

[0002] As the processing speed and storage capacity 
of digital computers continues to increase, so does their 
suitability for management of very large databases To 
be useful, a database typically needs to be searchable 
so ft* a user can perform keyword searching of the 
database and retrieve information associated with the 
keywords. For large databases, linear searching of the 
data for keywords is typically too time consuming to be 
useful Consequently, large databases are often 
indexed to provide fast query processing and retrieval of 
data. 

[0003] As is known, in most databases, data is organ- 
ized logically and often physically into individual fields 
within a record, with each record representing a collec- 
tion of data about an object such as a patient, a vehicle 
or a document and each field within the record contain- 
ing a different item of data about the object, such as the 
patient's last name, first name. age. gender, etc. To pro- 
vide flexibility, the database management program must 
not only permit single keyword searches of the data 
fields, but must support more complex queries, includ- 
mgbcolean expressions (e.g.. AND, OR. NOT) of key- 

[0004] Often, a database will include both high and 
ow cardinality data; that is. the database will include 
types of data that can have many different values as 
well as types of data that will have very few different val- 
ues. Examples of low cardinality data include a person's 
gender (male or female), political party affiliation, or the 
make of a vehicle. Examples of high cardinality data 
;nclirie such things as last names, birth dates, vehicle 
.dertrf.cat.on numbers (VIN). and social security num- 
bers (SSN). the last two of which will likely include a 
unique value in every record in the database. For opti- 
mum flexibility, a database index must permit efficient 
searching for both high and low cardinality data How- 
ever . typical storage methods are designed and opti- 
mized for retrieval of high cardinality data only, using 
single or compound key indexes. 
[0005] One way of indexing a database is through the 
use of keys that provide a fast way to locate soecific 
.terns of data within the database. Tnese 
by a database management program to locate and 
retrieve records from the database that contain the data 
associated with the keys. Bitmaps or bit vectors are 



sometimes used along with keys to identify which 
records contain a particular Hern of data within a oT2c 
£ ^ ™~ » vectors comprise a stiK 
2? T* * re P resentin 9 a single record in the 
5 database. A particular bit vector will represent a parte- 
ular rtem o data that is found in a particular fielS in at 
least one of the records of the database. The presence 

M^f£r* bit set t0 a ,09ical one) in the bit vector 

nd.cates that the record associated with that bit con- 

where a database contains patient records that include 
a "first name field, a bit vector of "0010100000" that is 

STr?* th6 nam9 " E "' Zabeth " *» *• ■** "«£ 
AM «B .ndicate that the third and fifth records are for 
is patients named Elizabeth. «*»«ireror 

[0006] Sometimes, in addition to indexing a database 
fte datebase itself is stored along with thj hJSS 

SZZT" 9 ,h ! database as ■ separately stored 
structure. For example, in US. Patent No. 4.606.002 to 

» ^Wa.smanetal..aB-treeisusedtostoreftedata! 

£? 2? alon9 with an inverted B " tree index *■» 

keys and associated sparse array bit maps to identify 
which records ; contain particular items oMata. A £3 

,5 T ? ' dentify diffSrent tables of recore,s <•*. 

* a table of patient records versus a table of doctor 

records) and to distinguish between the database data' 
and ft. indexes. The data is stored using odd num- 
bered record identifiers and the associated indexes are 

?* te L anda ^ iat ^^ ar ^a^ybitmapareasso- 
aated wrth arange of records and are stored together in 

f?JL I ^ ^ com P rises a record identifier, 
field identifier, data value, and range value, in that order 

» U? 8 r8C ? *2? m ier indicates to **** <* data the 

X2?Vl ,Wd Wenti "' er is a numeric i 
Je field within the records to which the key relates The 

data value is the actual item of data contained in at least 

one of the records to which the key relates. And the 

40 £?,!iT i 61 ™* 5 thS ran9e 01 records to "»*h the 
key relates. The sparse array bit map includes three lev- 

SiSLirS* b ° tt0m ,Wel com P rises a nur *er 
ofone-byte bit vectors, with the individual bits of each bit 
vector indicating which records within the associated 
range of records contain the associated data value in 
45 the held specified by the fieU fcientif ier. The middle level 
contains one-byte bit vectors in which each bit repre- 
sents one of the one-byte bit vectors of the bottom level. 
A bit in the .middle level bit vectors is set (to a logical 
one) if any of the bits in the bottom level bit vector ftat it 
so represents is set; otherwise it is zeroed. This same 
structure i is carried out to the upper level which includes 
a single byte representing all of the middle level bit vec- 
tors. Where a bit vector would not include any set bits 

ss !!^ h ^ Hs absence Is Wi- 

55 cated by its associated bit from the next higher level 
being zeroed. 

[0007] As noted by Waisman et al.. the use of bit vec- 
tors simplifies the processing of boolean expressions 
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since two bit vectors can be combined in accordance 
with the specified boolean operator and the resulting bit 
vector represents the records that satisfy the boolean 
expression. One problem in combining upper level bit 
vectors in which the bits do not represent individual 
records is in the processing of NOT expressions. Since 
a set bit indicates only that at least one (but not neces- 
sarily all) records in the associated range of records 
includes the data value, the NOT operation cannot be 
accomplished simply by inverting the bit. Rather, as dis- 
cussed in Waisman et al.. the lower level bits that repre- 
sent individual records must be inverted and used 

S ^ in Waisman et al - * e d ata is stored at 
the bottom level of a B-tree. consecutively numbered 
records need not be physically stored together This 
permrts the use of variable length fields and permits 
fields to be added to the database without having to 
reorganize the database or make changes to the data- 
base management program that is used to access the 
data. However, this type of database structure can com- 
plicate retrieval of records, since the fields of any one 
record may not all be physically stored together and 
since separate I/O accesses may often be required for 
retneval of even a small group of consecutive records 
Given the relative slowness of I/O access, the data stor- 
age structure utilized by Waisman et al. can result in 
undesirably slow retrieval of records. 
[0009] Accordingly, it is an object of the invention to 
prov.de a method and apparatus for managing large 
amounts of data in a manner that provides the following 



Each of the records includes a number of data fields 
with at feast some of the data fields having a date vLlue 
stored therein. The records are logically separata^ 
groups or fine shoes, of 

5 conta : n,n i 3 prese,ected «*Mn numbTn * 

or ™e ^ ,ine ^ ar9 ,0fliCa " y * «2 

or more sets, or coarse slices, with each coarse slice 

containing a preselected maximum number m of fine 

slices, whereby each coarse slice contains a malum 

1 nn^' ! aCh 01 "** is associated "»» a differ- 
ent one of the data fields and each of which comprises 
a number of keys. Each of the data values Ke 
s^edwrthinanindexeddatafieldhasoneormoretoe 

^ZV teys '"ere 

^l^T'r"'^ aSS0Ciat6d ^ 009 of f ™ 
J»!i!£ 9 b,t Vector * at identifie s which of 

me records contained within that fine slice include the 
data value associated with that key. The coarse kZ 
20 are .each associated with one of the coarse diciaS 
wrth a coarse bit vector that identifies which o Z Z 

25 BR'EF DESCRIPTION OP TUr= n B ^ mm 

I-li • \ Preferred embodiment of the 

present invention will hereinafter be described in con- 

30 ignations denote like elements, and: 



1 • Very fast query response; 

2. Fast update response; 

3. Support for complex retrieval operations, indud- ss 
mg the combination of multiple keys using boolean 
logic, wrth equally fast retrieval for all boolean oper- 
ators; ^ 

4. Minimization of the number of key indexes 
required; 

5 Minimization of the storage space required for 
the database index; 

6. A query response time that is proportional to the 
number of data items retrieved, rather than the 
number of data items stored in the database- « 
7^An extremely fast count operation for preview of 
retneval queries and fine-tuning of retrieval criteria- 
and 1 

8. An ability to handle both high and low cardinality 
data well in a single unified key index structure. so 
SUMMARY OF THE iMx/ CffrrN 

k ^IJH? 0 " 3 ? 3 " 06 present inve "ti°n there 

is provKled a database system that overcomes the 55 

^th" 0 !^ Pr0W9mS * ^^ent of large data 
sets. The data comprises a plurality of records stored on 
a data storage device in a computer-readable format 



^ 8tructure of an ^emp'ary 
embodiment of a database used in the preserj 

invention; 

Figure 2 is an overview of query processing using a 
key-based index of the present invention; 

Figure 3 depicts the logical separation of data 
records .nto coarse and fine slices for use in gener- 
ating database indexes in accordance with the 
present invention; 

Figure 4 is a diagrammatic representation of sam- 
ple veh IC le color data for each of a number of 
records from the database of Fig. 1 ; 

Figure 5 depicts the general form of the structure of 
an embodiment of an index constructed in accord- 
ance with the present invention; 

Figure ^6 depicts the index structure for the sample 
data of Fig. 4; 

Figure 7 shows the layout of keys used in the index 
of Figs. 5 and 6; 



Figure 8 shows the layout of the fine links used 



in 
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the index of Figs. 5 and 6: 



Figure 9 shows the layout of the coarse links used 
in the index of Figs. 5 and 6; 

Figure 10 shows the layout of the coarse bit vector 
used in the index of Figs. 5 and 6: 

Figure 1 1 depicts the B-tree structure which is used 
to store the keys and links of the index of Fig. 6; 

Figure 12 is a block diagram of a computer system 
for use in implementing the present invention; 

Figure 13 is a flow chart depicting a process in 
accordance with the present invention for counting 
the number of records that satisfy a given query; 
and 

Figure 14 is a flow chart depicting a process in 
accordance with the present invention for retrieving 
the records that satisfy a given query. 

DESCRIPTION OF THF PREFC q RED gMRQniMFMT 

Data Storage within th ft ito.^ 

[0012] Referring to Fig. 1 . there is shown a database 
20 compnsing a table of records 22. Each of the records 
has a fixed length and is assigned a unique record 
number, starting with zero. This allows the records to be 
stored sequentially in a single file with the location 
within the file of any particular record simply being 
determined by multiplying the record length (in bytes) by 
that record's assigned record number and using the 
resulting value as an offset from the beginning of the 
file. This arrangement provides very simple and fast 
allocation, de-allocation, and re-use of data record 
space within the file. Also, by storing all of the records 
within a single file, entire databases can be easily cre- 
ated or deleted. Creation simply requires creating a 
new. empty file. Deletion simply requires deleting an 
existing file. All user access to the database is by way of 
a database management program which allows the 
user to add, change, and delete data from the database 
As will be discussed below, the database management 
program supports boolean and other query processing 
using key-based indexes that provide fast access to the 
data within the database. 

tOOl 3] As shown in Fig. 1 , the illustrated embodiment 
of the invention includes a database 20 of vehicle infor- 
mation. Each record 22 has an identical format that 
includes a number of program data fields 24 and a 
number of user data fields 26. The program data fields 
include validity, self-id. next-free, and checksum fields 
These fields are not accessible to the user, but are used 
by the database management program as a part of 
managing the database, as will be described below The 



user data fields 26 include the actual data fields used to 
store the database data. In the illustrated embodiment 
these fields include vehicle make, model, year VIN 
engine, transmission, color, and a number of option 
5 fields for storing information about various options a 
vehicle may have; for example, a sun-roof, aluminum 
wheels, side-impact airbags. etc. As will be appreciated 
the user data fields can be specified by the user as a 
part of initially setting up the database, with the user 
10 specifying the size and data type (e.g.. swing, date, inte- 
ger) of each user data field to be included in the data- 
base. All fields, both program data fields and user data 
fields, are fixed length fields so that the records all have 
a fixed length and can be easily accessed using a calcu- 
is lated offset, as described above. 

[0014] The validity data field is used to mark the 
record as either in-use (valid) or deleted (invalid). This is 
used where a record has used and then deleted by the 
useun which case the validity field is used to indicate 
so that the record does not contain valid data and can be 
re-used. The validity field can be a single bit. although a 
byte pattern can preferably be used to prevent loss of 
data due to a corruption of a single bit. The validity field 
ran be used during recovery operations to determine 
25 the validity of data contained in the record's fields This 
field can be indexed along with the user data fields as 
will be described below. The self-id data field contains a 
unique identification of the record. The self-id com- 
prises the file identifier (e.g., filename of the database) 
30 and the record number of the record. This field is used 
by the database management program to verify that the 
record is in fact what it is believed to be. The next-free 
field points to the next free (i.e.. deleted) record using 
the offset of the record to which it points. This field is 
35 used to form a stack of deleted records which can be re- 
used when new data records are added to the data- 
base. Finally, the checksum field contains a checksum 
value computed on the entire contents of the record 
except for the checksum item itself. The checksum pr* 
*o vides validity checking to insure database consistency 
and correctness. 

Indexlnanfth f fte^h^ 

* [0015] To permit keyword searching of user data fields 
withm database 20. all searchable user data fields 26 
are indexed with keys that are used to identify which 
records contain a particular item of data (i.e data 
value) within a particular field. Typical queries might be 
so for example: 

VIN = XY2123 
MAKE o Chevrolet 

MODEL =. Corvette AND YEAR = 1975 
55 ENGINE - V6A OR ENGINE = V8G 

MODEL = Scout AND (Trans = T3 OR Trans = T4) 
AND NOT (COLOR - Grey OR YEAR < 1969) 
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[0016] Fig. 2 provides an overview of how query 
processing is accomplished using these indexes. For 
each indexed data field 24 and 26. a pointer 28 is pro- 
vided which points to an index 30 for that particular data 
field. The pointers are stored in a table 32 that is sepa- 5 
rate from the actual database itself. Using the index 30. 
the database management program obtains a list 34 
(represented by one or more bit vectors) of records 22 
that contain the keyword used in the query. For boolean 
and range searches (i.e.. for multiple keyword queries). 10 
the lists are processed by a query processor module 36 
that compares the lists 34 to each other in accordance 
with the appropriate boolean logic, resulting in a list 38 
of records that satisfy the query. The records are then 
retrieved and. if desired, processed by an optional sort , 5 
40. resulting in a final, sorted query result record list 42. 
[0017] The indexes 30 are actually collections of keys 
stored in a B-tree. In creating the indexes, separate 
keys are generated not only for each user data field, but 
also for each data value that is stored within that data » 
field in at least one of the records. Associated with each 
key is a link that is used to determine which records 
contain the data value associated with the key. To opti- 
mize the retrieval of records based upon query process- 
ing, a hierarchical structure of keys and bit vectors are ss 
used, with each key and bit vector representing no more 
than a certain number of records in the database. Thus, 
when creating the index, multiple keys and bit vectors 
are generated for each data value of each data field and 
the number of keys and bit vectors generated will 30 
depend upon the total number of records in the data- 
base and the distribution of data among those records 
Two types of keys and bit vectors are used: coarse and 
fine, with the fine keys and bit vectors each representing 
a group of consecutive records and the coarse keys and 35 
bit vectors each representing a set of consecutive fine 
bit vectors. 

[0018] The indexing of database 20 using these keys 
and bit vectors will now be described in detail in connec- 
tion with Figs. 3-11. Fig. 3 depicts the logical separation ao 
of records into groups of what will be referred to as fine 
slices, with the fine slices being logically organized into 
sets of what will be referred to as coarse slices. As 
shown in the top portion of Fig. 3, each fine slice com- 
prises 8.000 consecutive records and each coarse slice 45 
comprises 4.000 consecutive fine slices. Accordingly, a 
single coarse slice represents 32 million consecutive 
records in the database. Each fine slice has a unique 
absolute fine slice number which, for a given record k 
and fine slice length fsl. is equal to: OT 



lute coarse slice number that, for a given record k a 

fZlff™ S "' C ' ength 051 ■ and a 9 iven fi "e slice 
length fsl, is equal to: 

k DIV (cs/ x fsl). 

.I h „^ ,0 7!? rd number4 °.«20.973 with a coarse slice 
length of 4.000 and a fine slice length of 8,000, the 
absolute coarse slice number will be 1 (40,420,973 DIV 
32,000,000). 

[001 9] Within any particular fine slice, each record 22 
can be identified by a relative record number which indi- 
cates the position of the record within that particular fine 
slice For a given record k, the relative record number is 

GQUcll TO I 

k MOD fsl, 

where MOD indicates modulus division which returns 
mm?? 'i emainder - T** *» a fine slice length of 
8.000. (absolute) record number 40.420.973 has a rela- 
tive record number of 4.973 (40.420.973 MOD 8.000) 
Similarly, within any particular coarse slice, each fine 
slice can be identified by a relative fine slice number 
which indicates the position of the fine slice within ihat 
particular coarse slice. For a given record *, the relative 
fine slice number is equal to: 

(frMOD (cs/x fe/))DIVfe/, 



k DIV fe/ . 

where DIV indicates integer division which returns the 

!,n e ?o«„ C ili 0tie^, • ThuSl for (a^'ute) record number 55 
40.420.973 and a fine slice length of 8,000, the absolute 
fine slice number will be equal to 5.052 (40.420.973 DIV 
8.000). Similarly, each coarse slice has a unique, abso- 



Thus. for the slice lengths given above, the relative fine 
slice number for record number 40,420.973 would be 
1.052 ((40.420.973 MOD 32.000.000) DIV 8 000) As 
will be discussed further below, the relative record num- 
bers and relative fine slice numbers are used in connec- 
tion with the links associated with the fine and coarse 
keys, respectively. 

[0020] For purposes of illustration. Fig. 4 provides 
sample data from database 20 of Fig. 1 showing differ- 
ent data values stored in the "vehicle color" field 26 of a 
number of records 22 within the database. To aid in 
understanding the index structure. Fig. 4 also includes 
columns listing the absolute coarse and fine slice num- 
bers for the records 22, the relative fine slice numbers of 
the records within a particular coarse slice, and the rei- 

Srjf 0 ? "r" 6 * * the records a ^icular 
fine slice. It will be appreciated that the table of Fig 4 is 

sirrply a logical view of the data and associated slice 
numbers and does not represent any actual data struc- 
ture used by the database management program. Also, 
while the sample data provided in Fig. 4 will be used in 
connection with the following description and attached 
drawings, it will be appreciated that the sparseness of 
the vehicle color data is only provided for the purpose of 
simplifying the illustration of the database and that for 
tow cardinality data such as vehicle color, it is probable 
that most if not all fine slices will contain a large number 
of records having a particular data value, such as blue 
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[0021 ] For each data value of an indexed data field 
there will be one key generated for each fine slice and 
coarse slice that contains at least one record having the 
data value within the data field. For example, assuming 
Hie database contains 160.5 million records, there will 
be a minimum of 1 coarse key and 1 fine key and a max- 
imum of 6 coarse keys arid 20.063 fine keys generated 
for each item of data, with the actual number of coarse 
and fine keys depending upon the number and distribu- 
tor of the data value among the records in the data- 
base. For instance, in the sample data of Fig. 4 only 
records 32.128.000. 32.128.001. and 60.800 000 con- 
tern date value = -white" in the vehicle color field. Thus 
for this data value there will be a total of three keys- one 
coarse key (since all three records are within the same 
coarse shce) and two fine keys - one for fine slice 4016 

SSLnS'*"'? *** reCOrdS 32 - 12 8.°00 and 
32.128.001 . and one for fine slice 7.600. which contains 
record number 60.800.000. These three keys are 
shown in the sample index of Fig. 6. which will be dis- 
cussed below. 

[0022] Referring now to Fig. 5. there is shown the gen- 

™1 "L index 30 for one * the d*a «elds As 
mentioned above, the index is comprised of keys 44 and 

shce and data value with which it is associated and the 
link being used in determining which records contained 
in the slice include the data value in the associated user 
? 6aCh data value contained in »e data 

ISShTS"* 006 22 0f databas *. there 
will be at least one coarse key 44 and one fine key 44 

Very high cardinality data, such as VIN number may 

2, ! a " n9,e w coarse fi "e key. whereas low car- 
dmality data, such as gender, is likely to be found in 
every fme slice in the database and can therefore 
STSi tey, * we ^ fine 30(1 slice contained in 
? a l 2^ 8 ' ^ US ; for any partcu,ar data «eW having 

2?. J?" 8 "* ' 3 number m <* »«• slices 
and a number n of fine slices, the index will logically 
ake the form of Fig. 5. with there potentially bein?up to 
(n + m) x / separate keys and links. 

SL^l SPe 1 iC , eXanip,e ' R9 - 6d ep«s the actual 
SSL^ v ; ,c,e Wex 30 for the sample 
date from F,g. 4. The sample index includes a pluraSy 
of keys 44 and a link 46 associated with each of*!* 
key* As mentioned above, for each data value (eg 
Wack. blue. gold, green, etc.) found anywhere in the" 
datebase m .the W data field, there is provided a 
coarse key for each coarse slice and a fine key for each 
fine sice containing at least one record having that data 
value wrthin the WfieW. Thus, as exp Jed aLovl 
for the cofor whrte. there would be one coarse key ana 1 
two fine keys for the sample database. Although the 
index can be stored in various formats such « in Te 

^nTB'rerai^be 5 ! 3 " 11 6 " rtiS 
with Fig. 11. m whatever form the index is stored°the 
keys are maintained in order by ascending key value, as 
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w.11 now be described in connection with Fig. 7. 
[ ??? Ra 7 depicts the ^mat of the keys 44 
IL^V 6 ° f ,ina E3Ch key inc,udes three por- 
5 em "! '"to a single 

Son Z 9 * e k6yS ^ ,he index - 706 ** 
TZZ? " 3 810919 M «** indicat6s type of 
toy. wrth a mo indicating that it is a coarse keyand a 
one u^icating , that it is a fine key. The second portion 

10 i?f S? 68 .* 9 abS °' Ute S,ice number ^e key. wi* 
J- f !5 6 bein9 zera for a «"e key and a fine 

fZ^T**- 000 ' fi " e Slice 2 ««* coJres^on^ to 
r**rts 6.000 through 23.999. For a coarse key 

£Zh ? Q L W0Uld COrrespond to fine slices 8.000 
frrough 11,9999 (and, therefore, records 64 000 000 

key is the ke/s data value, which is simply the data 

20 SS J??* ^ **** ,ine and coarse «"«»» there 
k9y P6r date ^ no ^0 keys will be the 
same and. consequently, the three portions of the keys 
that are concatenated together provide a unique key 

« SSL? 6 J ,ndex 706 kevs ar * stored in order 5 
!f? 6ndl " 9 tey va,ue - T" 08 ' a « shown in Fig. 6 the 

5?* be9i " Wiln 3 Cleared bit > «* be 
toed mthe index before any of the fine keys (which 

begin wrth a set bit). Within the group of coaVse keyT 

«m keys will then be listed in order of slice number £ 

Tm a ^ S ' ,Sted in 0re,er 01 1,16 tevs ' data values 
2L??^, 6> fl0ld ' 9reen> ete >- which may be 
alphabetically ordered for characters and strings or 

tT^lV'V***^ ** 0rdered bv 8,106 nu "*er 
£?e£^ ; eteyShavin9 ^ same s liC enumber.by 

[0W6] As mentioned above, the link 46 associated 
„ f,ne k ey is used to indicate which records 

40 wrthin the associated fine slice contain the data value 
associated with the fine key. Referring now to Fig iTe 

IT ^ e ^ e " nk 46 iS Sh0wn " f in « «nk 4n tekl 
erther of two forms - a pointer to a bit vector or a relative 
record number (RRN). The first portion of the link com 
« prises a single b.t indicating which type of link it is A 

hi 6 l5 eared bft) indicates t" 9 ,ink is a pointer 
to a brt vector .nd a one (i.a. set bit) indicates that the 
•nk is a relative record number. The first type of link 

so ^ludesatleasttvrarecordscontainingthedatavalue 
The pointer can either be an offset in the case of the bit 
vector being stored in the same file as the index, or can 
be a filename of another file along with an offset, if nec- 
« If^u 8rth8r 9Vent tne bit vector wi " inclu de one bit 

Z^J!!^ 6 Withi " the fine slice wnich - in »e illus- 
Bated embod^errt. would be 8.000 bits, with the first bit 
indicating whether or not the first record in the fine slice 
contains the data value, the second bit indicating 
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whether or not the second record contains the data 
value, and so on for each of the remaining records in the 
fine shoe. The second type of link is used whenever 
there is only one record within the fine slice that con- 
tains the associated data valu . In that case, the fine 
link provides the relative record number of that one 
record. 

[0027] Referring back to Fig. 6 . the two fine keys for 
slice 1 provide an example of these two types of links. 

first fine key associated with slice 1 has a value of 
blue and can be represented as f:l .blue, with T indicat- 

^lar I?!!?* k !* the " 1 " bein9 lhe ■«***• a«e 
number, and "blue" being the data value to which the 

key relates. As indicated in Fig. 6. the link associated 
with key f:i thw is of type 0. meaning that the link con- 
tains a pointer to a bit vector 48. In this case bit vector 

fST"*^ S6t W 0 ° 9ical one) in bit P° sition s 0 and 
rJt T? I* ?° Siti0ns are cleared to a «ro. This indi- 
cates that relative record numbers 0 and 2 of fine slice 1 
contain Wue in the vehicle color field. Referring back to 
the sample database of Fig. 4. it will be evident that this 
is correct records 8.000 and 8.002 (which are relative 
records 0 and 2 of fine slice 1) contain the value "blue- 
in the vehicle color field. 

2?! Tur "! n9 J back t0 R 9- * «!• second key for fine 

that the link does not contain a pointer, but instead pro- 
vides the relative record number (RRN=1) of the only 
record within fine slice 1 that contains "red" in the vehi- 
cle color ,, eld. Referring again to Fig. 4. it will be seen 
that relatove record number 1 (which is absolute record 
number 8001) is in fact the only record within fine slice 
1 that contains the value "red" in the vehicle color field 
For high cardinality data such as a VIN. there will be a 
great number of keys created (since the number / of 
potential data values will be high) but very few records 
(often only one record) with the data value. Thus, where 
a slice has only a single record containing the data 

SUHES" °? 8 record number "»*» the n* 

rather ttian both a pointer and an almost 1 KB bit vector 
can^result in the saving of large amounts of storage 

[0029] Referring now to Fig. 9. the layout is shown for 
the coarse links that are associated with the coarse 

T JfJ?" the ,ine ,inkS ' *° C0a ' 8e £h£ 

TZh£° "J ich are - "Sing a single bit 

fZ^SZZ* 1*' ,in,t ^ ,irSt is by 
a zero brt and contains a pointer to a bit vector that rep- 

resents each of the fine slices within the associated 

2J? *5 ^ S8COnd ** is indicated ^ a one bit 
and s used whenever the coarse slice contains only 

2?2 S" J ** feCOrds ** cotriain *e data 
value. In this case the link contains the relative fine slice 
number (RFSN) of that one fine slice. Note that I 
on to the relative fine slice number, the second toeof 

cussed below in greater detail, is used to indicate 
whether or not all of the records within the fine S 



12 



include the data value. 

canle * theM *° of 

f ' n R9 ' 4 * 8nown herein, the value 

TJ»us. in Rg. 6 . key cttorange (coarse slice 0. orange) 

10 " J2r* 0t , hefS C,eared t0 zero - ^ "olw 

Tnl 22 2? 8185388 0f R9 ' 4 ' 11 is on| y heated 
one record within coarse slice 0. Thus, key c O areen of 

Rj 6 includes a link that is not a pointer to a 2 hS? 
,5 ^ 2U" ? re,8tiVe ,ine ^ number (RFSnTo, 

£ 11 f 6 ?* C ° ntains * e record ^ing "green- £ 
the vehicle color field. 

Sth^! sf 8fenCe t0 R9 - 10 ' * wil ' be se «" that, 
unlike the fme bit vectors 48. the coarse bit vector 48 is 

20 The f^st of these two bit vectors includes whatwiH be 
Jjjwd to as "ANY" bte. with the ANY brt v^oT ^a 
^eluding a single ANY bit for each of the 4 000 fin. 
slices contained within the coarse slice. An ANY Jfe 

25 wrthm its associated fine slice has the data valuein S 

SKSE* ,,S ** eANY «tissettoone ™ "s a* 
AW brt will be zero only if none of the records within its 

XESElK 09 COntain *• data va,ue - second 

a sinate A.f i,T *" ^ W vector 485 a,s ° 
hit = 9 5,t ,0f 8ach of 106 4 ' 000 slices. An ALL 
brt is used to indicate that all of the records contained 
wrth, n rts assocated fine slice have the data value. If so 

* ££±£7 2 ? one - tf J he data ^ is not indudad 

wrthm even a single record within the fine slice, then the 

cussed below, the ALL bit is useful in processing out 
nes involving the NOT operator 9 Q 

« ^J??^^^ ******** 

00t f 508 * in any of the reco ^ contained in a 
^' Ce ' "° ,ine tey is ™* where 

for ttioT "I* 008,86 *® v 18 Mr0, 00 ^' ne key is created 
for toe fine slice associated with that ANY bit. Similarly 

!l 6 reCOrds contained in a P arti ~'ar coarse slice no 
coarse p or fine keys are created for that coarse st 
™«s, in Rg. 6 . there is no coarse 0 key for "silver" 
because none of the first 32 million records contain^ . 

2LS!f a S,ICe 1 tey wnich has a link to its 
a tktet 8 ', C : 23> 8nd 8,ine *• 4023 k ^ whi <* has 
« ^[ 8tive reco,d wrrter 0 ( w hich is absolute 

tZl^?' 184 ' 00 ^' a88h0wn R» 4 Is 
55 SS' ^''^^erasthevehiclecolor. 

!SL " in9 n0W t0 Fi9 " 1 1 ' t"* 8 is shown the 

fo?fil v ^ f an index 30: in the index 

for f'eW color . The index is stored as a B-tree with keys 
44 stored not just at the leaves of the tre . but also at 

SSI 001337 
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the root and intermediate nodes. This provides faster 
searching of the B-tree on average, since the search 
need not traverse all levels of the tree when searchina 
for a key that happens to be located either at the root or 
an intermediate node. Also, the levels in the tree 
decrease faster due to deletions of keys that in tradi- 
tional B-tree structures, since a sole key is never left at 
a terminal node, but is instead moved up to the next 
level node. Furthermore, the keys at the root and inter- 
med.ate nodes are used to determine the search path 
through the tree. This saves storage space because the 
date stored at the root and intermediate nodes that is 
used to determine the search path through the tree is 
not duplicative of data stored at the leaves of the tree 
as in traditional B-trees. 
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Query Prn^in,, n^„T] the Indarec 

[0034] Referring now to Fig. 12. there is shown a com- 
puter system 50 for use in implementing the database 
index structure, and query processing of the present 
invention. Computer system 50 includes a computer 52 
having a microprocessor 54. RAM 56. a hard disk 58 a 
keyboard 60. and monitor 62. Computer 52 can be any 
of a number of commercially-available personal com- 
S UtSr !,^ nnin9 *" oP&afog system such as Win- 
dowsNT®. with microprocessor 54 comprising an 
Intel® Pentium® II or equivalent processor. Database 
20 is stored as a single file on a computer-readable 
memory such as a hard drive 58. Similarly, each of the 
indexes are stored on hard drive 58 as a separate file 
Hard drive 58 can comprise a fixed magnetic disk or 
other non-volatile data storage. Depending upon the 
size of database 20. the non-volatile data storage 
device may comprise a number of hard drives 58a-n 
such as in a RAID array, with the database spanning 
two or more of these hard drives. As mentioned above 
the database management program 64 is used to setup 
and maintain the database 20 and its indexes, as well 
as to perform query processing and associated retrieval 
of records using the indexes. As with database 20. data- 
base management program 64 is also stored in compu- 
ter-readable format on hard drive 58. As will be 
appreciated, computer 52 can be a server attached via 
a^network interface card (not shown) to a network 
whether .t be a local area network or a global computer 
network such as the Internet 
[0035] Query processing is implemented by computer 
SmriSL ™ r * rocessor 54 executing instructions 
rom database management program 64. Program 64 
ocates the one or more records that satisfies a particu- 
ter user query by creating a target keys (e.g.. c:0.tlue) 
for each coarse and fine slice and then searches the 
app^nateindexfor those target keys, starting with the 
towest key valued key (i.e.. coarse slice 0). If no key is 
found, a brt vector of all zeros is returned. If a matching 
key ,s ; found in the index, then the associated link is 
used to obtain a bit vector for that key. If the link is of 



type 0 as shown in Figs. 8 and 9. then the bit vector 
jdentf,ed by the link is returned. Where one or botTof 

£ K 1 : * at iS " * * contain a rela 

tive fine slice number (in the case of a coarse key) or a 
5 refatrve record number (in the case of a fine key) rather 
ttian a pointer to a bit vector, then a bit vector is created 
^'J, 0 ! 3 !" 6 bit V8Ctor - ** Wt corresponding to the 
ng bits of the vector being cleared to zero. When creat- 

rA^^ {whi * includes ** ANY brts 
and ALL bits), the ANY bit corresponding to the fine 
sl.ce number identified by the link is set to one. wiJi *e 
remaining ANY bits being cleared to zero, and Zall 

« 5l?T eSPOndin9 to the fine **» number Wentified by 
is the link is set to the same value (0 or 1) as the ALL brt 
comamed in the .ink. with the other ALL bits being 

aSbL°« er °J n thiS Way ' qU6ry can 
ahrays be earned out using bit vectors, regardless of 
wh.ch type of link is stored in the index 

20 fSL! ntheMM ^^ le< l ueri es.suchasMAKE = 
Chevrolet once a coarse bit vector has been obtained 
it can be used to determine which fine slices contain 

Scl r^"^ qU6ry - 7718 for *om Se 
slices (e.g.. fttChevrolet) can then be accessed in 

obtained. As records containing the data value are iden- 
tified, they are retrieved for processing. 
[0037] For boolean operations, such as would be 
required for a query of MODEL = Corvette and YEAR = 

^Jf-~ rre ^"dingbitve<torsforeachofthekeyword 
search terms are obtained in the manner described 

^ a " d * en are "Really combined in accordance 
wrth the boolean Logic (AND) specified in the user's 
query The following operators are used to perform 
as boolean operations on the bit vectors: 

ANDBV 

[0038] This is a binary operator, taking two bit vector 

fl^Tl' "? retUming a sin 9' e bit ***** result All 
brts of the two bit vectors are logically AND-ed together 
yielding a single resutt bit vector. The operation! Z 
teal for coarse and fine bit vectors. 

45 OR_BV 

[0039] This is a binary operator, taking two bit vector 
pararneters. and returning a single bit vector result All 
MB of the two bit vectors are logically OR-ed together 
so yielding a single result bit vector. The operation is iden- 
tical for coarse and fine bit vectors. 



NOT_BV 



ss [0040] This is a unary operator, taking one bit vector 
parameter, and returning a single bit vector result All 
bits of the one bit vector are logically NOT-ed (comple- 
mented), yielding a single result bit vector. The opera- 
SSI 001338 
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tion is different for coarse and fine bit vectors. For fine 
bit vectors, all bits are simply NOT-ed in place For 
coarse bit vectors, the ANY and ALL bits are NOT-ed 
and than > the ANY and ALLbit vectors are swapped- that 
is. the ALL bit vector is moved to the left portion of the 
coarse bit vector as shown in Fig. 10 and thereby 
becomes the ANY bit vector for the NOT-ed coarse bit 
vector. Similarly, the ANY bit vector is moved to fight 
port.cn of the coarse bit vector so that it becomes the 
ALL bit vector of the NOT-ed coarse bit vector. 
[0041] The following are basic "find" operators that are 
used in searching through an index to obtain a bit vector 
for specified target keys or key ranges. 

RND_EQUAL_BV 

[0042] This operator searches an index to find the key 
that matches the specified target key. which is a param- 
eter to this operator. There will be at most one entry in 
the B-tree matching the target key. If the target kev 
doesn't exist, a bit vector of all zero bits is created and 
returned. At the end of this operator, a current path 
structure is created pointing to the location in the B-tree 
where the target entry was found, or where it would 
have been found if it had existed, and the target key is 

ZULU?" bY the nND - NEXT -BV operator dis- 
cussed below. 
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atrate In" ™ ^Z*™ m " >cU-d mul,i P ,e «™s to 
EE! ™ 6 * an 006 S,iCe Type or Slice 

s FINO_LSS_SLICE 

[0045] This operator searches an index to find all 
thT^T ° ata Va,ue < see R 9- 7) * lesbian 

» SLS^ 1 "? ke * which is 3 parameter to « 

operator. It operates as follows. A RND EQUAL BV 

urtSrJl - 0t> ""* iS a * ecutad continuously 

is ? 3 C ° mp,eted result - Each ■■«** result 

» brt vector is logically OR-ed with the OR BV operate 

S fte^OT pv reSUlt bft V6Ct0r iS ,09ica "* NOT "^ 
ZVS ^ " ° Peiator - 1116 "ND.LSS SLICE 
^ result bit vector is returned as the operator result 

FIND_LEQ_SLICE 



FIND_NEXT_BV 

[0043] This operator searches an index to find the next 
key whose field Slice Type and Absolute Slice Number 
values (see Fig. 7) match the target key's field Slice 
Type and Absolute Slice Number values. Thus, the kev 
data value portion of the key is ignored. The target key 

HND_EQUAL_BV operator. There can be any number 
of entnes in the B-tree matching the target key. ff the 
next target key does not exist, a special completed 
result value is returned to indicate that no more entries 
exist matching the target key. The search starts from the 

nSS^eSuSr^ by ** ,ast previous 

-J^ AL - BV 0perator ' or lJ P dated by the last pre- 
vious RND_NEXT_BV operator. At the enS of this oper- 
ator the current path structure is updated to point to the 
location ,n the B-tree where the next target key was 

S 44 L? ef0 " OWin9 3,6 relational r,nd " operators that 
are used to search through an index. Each execution of 
one of these operators finds one or more target keys in 
an ,ndex. and returns the bit vector, either coarse or 
associated with the target keys. If multiple keys are 
S^rf 800 '** bit vectors are "ogioa'ly OR-ed 

2?, to J* m 3 Sin9 ' e result bit vector - ™s ^gle 
result bit vector ,s an accumulation of all the keys found 

T^T^J*? ° Perator ""a* °" • **• Slice 
Type value and Absolute Slice Number value. Depend- 
ing on theoperator. it executes on one or more Key Data 



[0046] This operator searches an index to find all 

25 03,3 V3,Ue iS ,ess or «£ to 

?"!^ rf ^ te ^ ttey '^ ich is ^ parameter to ihfe 
operator, ft operates as follows. A RND EQUAL BV 

T^e nJS^jr* " V6Ct0r iS S6t ,0 311 zsro bite - 
so !Z^£- NE *T-BV operator executed continuously 

b?l^ f 3 COmpleted result - Each result 
brt vector is logically OR-ed with the OR BV ooerator 

SSd V^ L P EQ - SL,CE resu,t 

reSUlt bit VeCt0r is '°9 ica,| y N OT-ed 
with the NOT.BV operator. The RND LEQ SLICE 
35 resuft brt vector is returned as the operatoTresuf 

RND_EQL_SLICE 

[0047] This operator searches an index to find all 
Cr^^ eyDateVa,ueis ^ a| toth eS pSiS 
target key. which is a parameter to this operator 
ates as follows A RND.EQUAL.BV operator ist l 
FMJ EQL SLICE * 8 



RND_PEQL_SUCE 

so [0048] This operator searches an index to find all 
ertnes whose Key Data Value is partial* equV toVe 

*?■ *** iS 3 P^meter to mis 
ZZX* equa L means tnat 1,16 entry Key Data Value 

55 ^l^^^^^^y^thelengthofthetar. 

EF ? ata ^ a,Ue (the tar9et ^ is a partial Key 

tHeSS^'m,* ° PerateS 38 fo,,ows - 
RND.EQUAL.BV operator is executed on the targ t 

key. generating an initial RND_PEQL_SUCE result bit 

SSI 001339 
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vector. The FIND_NEXT_BV operator is executed con- 
tinuously until it returns a completed result, or the next 
key's Key Data Value is greater than the target Key Data 
Value. Each execution result bit vector is logically OR- 
ed I with the OR_BV operator into the 
RND_PEQL_SUCE result bit. vector. The 
F1ND_PEQL_SLICE result bit vector is returned as the 
operator result. 

FIND_NEQ_SLICE 

[0049] This operator searches an index to find all 
entries whose Key Data Value is not equal to the speci- 
fied target key. which is a parameter to this operator It 
operates as follows. A FIND_EQUAL_BV operator is 

S2??^ 0, l the target key " g^erating an initial 
™°- NEQ - SL,CE result bit vector. The 
F1ND_NEQ_SLICE result bit vector is logically NOT-ed 
with the NOT.BV operator. The FIND NEQ_SUCE 
result bit vector is returned as the operator result 



FIND_EQUAL_BV operator is executed on the mini- 
mum target key. generating an initSl 

RND NEXT BV UCE ^ '« ™ 
i 7 I - ° pemtor is ax8cuted continuously 
5 untu it returns a completed result, or the next Key Data 
Value is greater than the maximum target key's Key 
Data Value. Each execution result bit vector is logically 
OR-ed with the OR BV operator into tha 
RND_RANGE_SL.CE Tesult bit vectoT £ 

10 L " RA f GE - SUCE resu,t bit vector is ^rned as 
the operator result. 

Retrieval of Record^ 



FIND_GEQ_SLICE 

[0050] This operator searches an index to find all 
emr.es whose Key Data Value is greater than or equal to 
the specified target key, which is a parameter to this 
operator. It operates as follows. A FIND EQUAL_BV 

Sr^M^^ 00 the teraet ke * generating an ini- 
1 ™D_GEQ_SLICE result bit vector. The 

Jj, " " BV 0perator is «»cutad continuously 
until it returns a completed result Each execution result 
bit vector is logically OR-ed with the OR BV operator 

l^ nND - GEQ - SL,CE result The 
Fl ND_GEQ_SLICE result bit vector is returned as the 
operator result. 

FIND_GTR_SUCE 

[0051] This operator searches an index to find all 
errtr.es whose Key Data Value is greater than the spec- 
ified target key. which is a parameter to this operator It 
operates as follows. A FIND_EQUAL_BV operator" is 
executed on the target key. The initial 

TneFIND_NEXT_BVoperatoris executed continuously 
until rt returns a completed result Each execution result 
bit vector is logically OR-ed with the OR BV ooerator 
into the FIND_GTR_SL.CE result b^cT ^e 
FIND GTR.SUCE result bit vector is returns ihe 
operator result. 

FIND_RANGE_SLICE 

E^V^** S6arChes 30 index <«> find all 
entries whose Key Data Value is greater than or equal to 
he specrf.ed minimum target key. and less than or equal 
to the specrf led maximum target key. which are parame- 
ters to this operator. It operates as follows A 



° 531 ^f aresult <" the query process, a list of abso- 
lute record numbers is generated, with the list re D re- 

d!^ 9 3 5** a a " * the records ^ned in the 
database. The records are listed in record number order 
as an inherent result of the index structure and query 
20 P roc ess.ng techniques described above. The refrieval 

20 the list of records generated as a result of the query 
processing. The retrieval operation described is 
designed to provide very fast retrieval response 

2? 1 . th9 preferred embodiment, there are two 
toes of retrieval operations: COUNT and FIND. A 
COUNT retneval simply counts the selected subset of 
records. Since the records themselves don't need to be 

so I* T ra *° n iS extreme, y fast - 106 retrieval 

so operation to select and COUNT a subset of records 
from a database is shown in Fig. 13. A FIND retrieval 
retrieves each of the records in the subset The retrieval 
operation to select and FIND a subset of records from a 
database is shown in Fig. 14. Since the selection oper- 
as ation rapidly identifies the selected records, the retrieval 
bmeis in general proportional to the number of records 
selected, not the number of records in the data record 

Aunic ' u e aspect of the retrieval operation is 
40 that rt operates by examining the indexes in record 
number order first and then field value order, rather 
ttian held value order first and then record number 

S 6 ii Z ^ ** iS P0ssib,e is ** 38 «*°wn ^ 
« 1 If **** iS or 9 anized ordered for 
« efficient index-sequential operation in this search order 
A giver, slice and data value combination can be effh 
cently located in the index B-tree. This is accomplished 
uang normal B-tree search methods, starting at the top 

so nil* ^JH^ Uti "' zin9 3 binaf y search in each tree 
so node, until the target index entry is found. Since all of 
the field values for the given slice are stored in sequen- 
tial entries in the B-tree immediately preceding and fol- 
lowing the target key. they may be retrieved easily and 
« ^ em i^ icWv - A second reason this is possible is 
ss the combination of coarse and fine bit vectors As a 
result of this combination, a single coarse bit vector rep- 
resents 32.000.000 records. This means that by 
processing a single coarse slice for the retrieval criteria. 

SSI 001340 
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32,000,000 records have been processed, and the fine 
slices of interest in the current coarse slice have been 
identified. From there only the fine slices of interest are 
directly accessed and processed 8.000 records at a 
time. Then only the individual records of interest are 
counted or retrieved. A third reason this is possible is 
the ability to logically NOT a bit vector rapidly. For exam- 
ple, the NOT EQUAL criteria is implemented by finding 
the bit vector EQUAL to the criteria, and NOT-ing this bit 
vector. This is much faster than finding all the bit vectors 10 
NOT EQUAL to the criteria. The special capability of a 
coarse bit vector to contain both "ANY" and "ALL" bit 
vectors allows the NOT operation to work as effectively 
on coarse bit vectors as on fine bit vectors. Yet a fourth 
reason this is possible is the ability to rapidly evaluate a « 
LESS THAN. LESS THAN OR EQUAL. GREATER 
THAN, GREATER THAN OR EQUAL, or RANGE crite- 
ria by simply OR-ing together bit vectors stored sequen- 
tially in the index. Since each coarse bit vector 
represents 32.000.000 records, it is much faster to enu- „ 
merate the Key Data Values within a slice this way. than 
to find the Key Data Values first, and then enumerate 
the record numbers. Thus, the retrieval criteria is quickly 
processed in record number order 32.000.000 records 
at a time. Even with a record count of 1 billion records « 
only 32 coarse slices would need to be examined. 
[0056] During execution of the criteria code, the valid- 
ity index (see Fig. 2) may be used by the retrieval oper- 
ation. The validity index is a standard index with a one 
bit for every in-use data record, and a zero bit for every » 
deleted data record. If a NOT_BV operator is executed 
at any time in the criteria code, a flag is set specifying 
that the validity index is needed. This flag is needed 
because a zero bit in a bit vector represents deleted 
data records as well as data records not matching the as 
current crrteria. If this flag is set at the end of execution 
of the query processing, the validity index bit vector for 
the current coarse or fine slice is AND-ed in to the cur- 
rent result bit vector with the ANDBY operator This 

£ n Hiot eS D w elet8d reCOndS whfch were introduced by 4o 
the NOT.BV operator from the final result bit vector 
[0057] The database must be locked against update 
during certain portions of the retrieval operation The 
retrieval operation is optimized to reduce the number of 

!H e i and durati0n of ** ****** 708 database is « 
locked with a shared-lock (reader lock) only during exe- 
cution of the query process. This allows any number of 
other retrieval operations on the table to proceed con- 
currently, while temporarily locking out update opera- 
tions. The database is locked at the beginning of the 50 
query processing, and unlocked at the end of the query 
processing. Since thequery is executed a slice at a time 
(via the bit vector mechanism), a single lock-unlock 
cycle covers query processing for 32,000.000 records 
tor coarse slices, and 8.000 records for fine slices The 55 
design of the index makes criteria code execution very 
simple and quick. In this way. the number of lock-unlock 
cycles is minimized, and the duration of the time the 
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database is locked against update is minimized 

2?2L- n, - e d !? aSe fe Unl0Cked dUrinsj loadi n9 
and processing of retneved data records. This creates 

ttie possibility of a data record being modified between 
the time the query is executed, and the time the data 
record is loaded. This means that the actual data record 
loaded could no longer match the retrieval criteria This 
problem is avoided by assigning a unique update trans- 
action number to each data record update. This update 
transaction number is stored in the data record itself 
The then-current update transaction number is captured 
and stored by the retrieval operation during the execu- 
tion of the query. When a data record is loaded during a 
retrieval operation, its update transaction number is 

W HT re !L a9ainSt the query ' s "P*** transaction 
number. If the data record update transaction number is 
higher, it means the data record has been updated 
ance , the execution of the query, and may no longer 
match the retrieval criteria. 

[0059] When this condition is detected, the updated 
data record is not processed. Instead, the retrieval oper- 
ation .s interrupted, and restarted at the current coarse 
and fine slices. The query is re-executed for the current 
coarse slice and then for the current fine slice. Process- 
ing of the current fine slice is then restarted at the bit 
represented by the record number which was being 
loaded and processed at the time of the interruption 
This mans that the data record is consistent with the 
newly executed query, and processing resumes (unless 
fte data record has been updated again, in which case 
the interruption/restart will be repeated). Thus, the 
retneval operation can evaluate retrieval criteria for 
large numbers of records at a time, with minimal locking 
while providing consistency of results. The method of 
retrreval operation processing in record number order 
provides the ability of interrupting and restarting 
retrieval at any record number location simply and effec- 

[0060] it will thus be apparent that there has been pro- 
vided in accordance with the present invention a data- 
base method and apparatus which achieves the aims 
and advantages specified herein. It will of course be 
understood that the foregoing description is of a pre- 
ferred exemplary embodiment of the invention and that 
the .nvention is not limited to the specific embodiment 
shown. Various changes and modifications will become 
apparent to those skilled in the art. For example to con- 
serve storage memory, the size of the data value por- 
tions of the keys can be made variable rather than being 
fixed at a size selected to accommodate the larger data 
values. All such variations and modifications are 
intended to come within the scope of the appended 



Claims 



1. 



A computer-readable memory for storing a data- 
base and indexes used to locate data within the 
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database, comprising 

a non-volatile data storage device; 

a database comprising a plurality of records 
stored on said data storage device in a compu- 
ter-readable format, each of said records hav- 
ing a number of data fields with at least some of 
said data fields having a data value stored 
therein; 

wherein said records are logically separated 
into groups of records with each group contain- 
ing a preselected maximum number n of 
records, and wherein said groups of records 
are logically organized into one or more sets, 
with each set containing a preselected maxi- 
mum number m of groups, whereby each set 
contains a maximum of n*m records; 

a plurality of indexes, each of which is associ- 
ated with a different one of said data fields and 
each of which comprises a number of keys 
associated with said one data field, whereby at 
least some of said data fields are indexed; 
wherein each of said data values that are 
stored within an indexed data field has one or 
more fine keys and one or more coarse keys 
associated therewith, wherein said fine keys 
are each associated with one of said groups 
and with a fine bit vector that identifies which of 
the records contained within that group include 
the data value associated with that key. and 
wherein said coarse keys are each associated 
with one of said sets and with a coarse bit vec- 
tor that identifies which of said groups include 
at least one record having that data value 
stored therein. 

2 r A computer-readable memory as defined in claim 1 
wherein said database includes a certain number k 
of records, wherein k > n and wherein all of said k 
records are physically maintained on said data stor- 
age device as a single file, whereby more than one 
group of records are stored together as a single file. 

3. A computer-readable memory as defined in claim 2 
wherein all of said records of said database are 
stored in a single file. 

4. A cornputer-readable memory as defined in daim 1 
wherein said indexes include a link for each key in 
the index, whereby each of said links is associated 
with adata value, and wherein at least some of said 

vertore ,ndiCate * 1e l0Cati0n * °" 9 of *** bit 

5. Acornputer-readable memory as defined in claim 4 
wherem said keys include a first portion that identi- 
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ties whether it is a group key or a set key. a second 
portion that identifies the group or set to which it 
corresponds, and a third portion that indicates the 
data value to which it corresponds. 

5 

6. A computer-readable memory as defined in claim 5 
wherein said keys are stored in order based upon 
the contents of said first, second, and third portions. 

wherein said f.rst portion comprises the left most 
portonof each key. said second portion comprises 
tte rnKWie portion of each key. and said third por- 
, $ *on composes the right-most portion of each key 

8. Acornputer-readable memory as defined in claim 7 
wherein said first portion comprises a single bit. ' 

9. ^computer-readable memory as defined in claim 4 

2? J"? 6 l Ch ? 8aid fine keys is associated with a 
fine hnk and wherein at least some of said fine links 
each provide a pointer to a corresponding one of 
said fine bit vectors. 

« 10. Acomputer-readablememoryasdefinedinclaims. 
wherein at least one other of said fine links identi- 
fies a single record within the group of records 
associated with that fine link. 

30 11 ' t C ?2^' n * UBi * memof y as defined in claim 
10, wherein each of said fine links includes at least 
one bit that indicates whether the link includes a 
pointer to a bit vector or an identifier of said single 
record. 

35 

12. A computer-readable memory as defined in claim 4 
wherein each of said coarse keys is associated with 
a coarse link and wherein at least some of said 
coarse links each provide a pointer to a corre- 

40 sponding one of said coarse bit vectors. 

13. A computer-readable memory as defined in claim 

12. wherein at least one other of said coarse links 
identifies a single group within the set of groups 

45 associated with that coarse link. 

14. A computer-readable memory as defined in daim 

13. wherein each of said coarse links includes at 

least one bit that indicates whether the link includes 

» a pointer to a bit vector or an identifier of said single 
group. 3 

15. A computer-readable memory as defined in daim 

I .*/ er6in each of 0065 oi 831(1 coarse links 
mat identify a single group include at least one bit 
that identifies whether or not all of the records 
within that single group contain the data value 
assodated with the coarse link. 
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1 6. A computer-readable memory as defined in claim 1 
wherein each of said indexes are stored as a B- 
tree. 



17. A computer-readable memory as defined in claim s 
16, wherein said B-tree includes a root node, a plu- 
rality of intermediate nodes, and a plurality of 
leaves and wherein some of said keys are stored at 
said leaves and others of said keys are stored at 
said root and intermediate nodes. 10 

18. A computer-readable memory as defined in claim 1 
wherein said non-volatile data storage device com- 
prises a plurality of fixed magnetic disk drives 
wherein said database is stored as a single file that is 
spans at least two of said fixed magnetic disk 
drives. 

9. A computer-readable memory as defined in claim 1 
wherein said indexes are each stored in a separate 20 
file on said non-volatile data storage device. 



25 



30 



35 



40 



45 



50 



55 



EP0 961 211 A2 




SSI 001344 



EP0 961 211 A2 




SSI 001345 



EP 0 961 211 A2 




SSI 001346 



EP0 961 211 A2 



FIG. 4 

Sample Vehicle Color Data 
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FIG 5 

Index Structure 
44 KEY 



30 



LINK 




SSI 001348 



EP0S61 211 A2 



FIG. 6 



Index for Field "Color" 
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FIG. 7 

Key Layout. ^—443 
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(0 = Coarse; 1 = Fine) 1 (Coarse or Fine) 



Key Data Value 



FIG. 8 

Fine Link Layout 



Link Type 
(0 = Bit Vector) 




Pointer to Fine Bit Vector 
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(1 = Single Record) 



— or — 



Relative Record Number (RRN) 



FIG. 9 

Coarse Link Layout 
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(0 = Bit Vector) 



Pointer to Coarse Bit Vector 
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Link Type 
(1 = Single Slice) 



— or — 




"ALL" Bit 



Relative Fin Slice Number (RFSN) 
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FIG. 12 
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Structure of Index for Field "Color" 
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RPSN = Relative Fine Slice Number 
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FIG. 7.9 

Count Process for Retrieval of 
Records based Upon a User Query 
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FIG. 14 

Retrieval of Records 
based Upon a User Query 



0 



Compute number of coarse slices nc in the table- 
set current absolute coarse slice number cs to o' 




Add 1 toes: 
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